(57) Abstract: With audio data reduction on the basis of ISO/IEC standard 11172-3, a frame length varying by 8 bits is used at
a sampling frequency of 44.1 kHz in order to arrive, on average, at a particular fixed data rate. The lengthening of a data frame
is signalled by a padding bit in the header of the frame. The invention dispenses with evaluation of the padding bit. Instead, the
mean frame length L is calculated and rounded down to the next integer. For the subsequent frame it is first established whether the
expected sync word for this frame appears. If this is so, this frame is decoded without taking into account the padding bit. If
the expected sync word for this frame does not appear, the decoding of the frame is started 8 bits later, again without taking into
account the padding bit.
Method and apparatus for decoding a coded digital audio
signal which is arranged in frames containing headers

The invention relates to a method and an apparatus for 
decoding a coded digital audio signal which is arranged
in frames containing headers.

Prior Art 

When using audio data reduction on the basis of ISO/IEC
standards 11172-3 and 13818-3, a frame length varying
by 8 bits is used at a sampling frequency of 44.1 kHz
in order to arrive, on average, at a particular fixed
data rate (e.g. 128 000 bits/sec). The 'lengthening' of
a data frame is signalled by the 'padding bit' in the
header of a frame. This method is described in more
detail in EP-A-0402973. The frames initially also
contain a sync word.

20 Invention 

The evaluation of this padding bit in the decoder can
cause difficulties. By way of example, in highly
optimized decoders, the digital signal processors (DSP)
they contain require very sparing use of storage space.
However, since the header in a frame is read at the
start of decoding of the frame, but the value of the
padding bit is not needed until right at the end of
decoding of this frame, a DSP implementation typically
wastes an entire storage location (an integer value of,
by way of example, several bytes in length) merely on
storing the value of the padding bit.

It would be possible to reduce the required storage
space by dispensing with the 'padding', i.e. the frame
lengths would always be kept constant even at a
sampling frequency of 44.1 kHz. However, a particular
fixed data rate of, by way of example, 128 000 bits/sec
is then no longer obtained, but rather a value which is
0.23% lower. A decoder which relies upon a constant
frame length always being used even at a sampling
frequency of 44.1 kHz would also no longer be
compatible with the aforementioned ISO/IEC standard.
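The 0.23% figure can be checked with a short calculation. The following is only a sketch: it assumes the example parameters used in this description (1152 samples per frame, 8-bit slots, 128 000 bits/sec, 44 100 Hz), and the function name is hypothetical. It compares the rate obtained with a constant, rounded-down frame length against the nominal rate.

```python
from fractions import Fraction


def rate_with_constant_frames(n_samples, rate, fs, slot_bits=8):
    """Data rate in bits/sec if every frame has the rounded-down length."""
    mean_len = Fraction(n_samples * rate, fs * slot_bits)    # slots per frame
    const_len = mean_len.numerator // mean_len.denominator   # rounded down
    frames_per_sec = Fraction(fs, n_samples)
    return const_len * slot_bits * frames_per_sec


nominal = 128_000
actual = rate_with_constant_frames(1152, nominal, 44_100)
shortfall_percent = float((nominal - actual) / nominal * 100)
print(f"{float(actual):.2f} bits/sec, {shortfall_percent:.2f}% below nominal")
```

With these assumed parameters the constant frame length is 417 slots, which yields 127 706.25 bits/sec instead of 128 000 bits/sec.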

The invention is based on the object of specifying a
method which allows less storage space to be used but
maintains the compatibility with the ISO/IEC standards
11172-3 and 13818-3 or with similar standards. This
object is achieved by the method specified in Claim 1.
A decoder using this method is specified in Claim 5.

In accordance with the invention, the data frames of 
varying length are evaluated on the basis of the
respective length, but evaluation of the padding bit 
from the header is avoided. Since the value of the 
padding bit is normally used to ascertain the exact 
position of the start of the next frame, the invention 
involves ascertaining the start of the next frame in 
another way, namely by calculating a mean frame length 
and a rounding-down or rounding-up of this mean frame 
length to the closest integer byte values for the 
received frames. 

The advantage is that the value of the padding bit does
not need to be stored for the entire time taken for
decoding a frame, and hence storage space can be used
more frugally.

In principle, the inventive method relates to the 
decoding of a coded digital audio signal which is 
arranged in frames containing headers, where the header 
in a frame contains a respective information item 
regarding whether this frame has a standard length or a 
length which differs therefrom for some of the frames, 
and where the frames contain a respective sync word, 
having the following steps:

- the length-variation information regarding the
  respective frame length is not stored or evaluated;
- the approximate start of the next frame is
  determined using the following formula:
  L = N * R / fs / SL,
  where L is equal to the length of the frames, N is
  equal to the number of samples per frame, R is equal
  to the total data rate, fs is equal to the sampling
  frequency, SL is equal to the stipulated subunit for
  indicating the frame length;
- L is rounded down to the next integer of subunits
  SL;
- for the subsequent frame, it is first established
  whether the expected sync word for this frame
  appears;
- if the expected sync word for this frame appears,
  this subsequent frame is decoded without taking into
  account the length-variation information;
- if the expected sync word for this frame does not
  appear, the decoding of this subsequent frame is
  started one subunit SL later without taking into
  account the length-variation information.
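The steps listed above can be sketched as follows. This is a minimal illustration and not a definitive implementation: the function names, the `has_sync` predicate and the byte-aligned 12-bit all-ones sync test are assumptions made for the example, and the toy stream uses a rounded-down length of 4 bytes to stay small.

```python
def rounded_down_length(n_samples, rate, fs, slot_bits):
    """L = N * R / fs / SL, rounded down to whole subunits SL."""
    return (n_samples * rate) // (fs * slot_bits)


def next_frame_start(stream, frame_start, floor_len, has_sync, slot_bytes=1):
    """Return the byte offset at which decoding of the next frame starts.

    The length-variation information in the header is never read: the
    expected pattern is checked at the rounded-down position first and,
    if it is absent there, decoding starts one subunit later.
    """
    candidate = frame_start + floor_len * slot_bytes
    if has_sync(stream, candidate):
        return candidate
    return candidate + slot_bytes


def has_mpeg_sync(stream, pos):
    # Assumption: a 12-bit all-ones sync word at a byte-aligned position.
    return (pos + 1 < len(stream)
            and stream[pos] == 0xFF
            and (stream[pos + 1] & 0xF0) == 0xF0)


SYNC = bytes([0xFF, 0xF0])
# Toy stream: a lengthened frame (5 bytes), then two standard frames (4 bytes).
stream = SYNC + bytes(3) + SYNC + bytes(2) + SYNC + bytes(2)
second = next_frame_start(stream, 0, 4, has_mpeg_sync)
third = next_frame_start(stream, second, 4, has_mpeg_sync)
print(second, third)
```

In this sketch the first frame is one subunit longer than the rounded-down length, so the sync test fails at offset 4 and decoding of the second frame starts at offset 5. The second frame has the standard length, so the third frame is found directly at the estimated position.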

In principle, the inventive apparatus relates to a
decoder for decoding a coded digital audio signal which
is arranged in frames containing headers, where the
header in a frame contains a respective information
item regarding whether this frame has a standard length
or a length which differs therefrom for some of the
frames, and the frames contain a respective sync word,
where, for ascertaining the frame length, the
length-variation information regarding the respective
frame length is not stored or evaluated, and where the
apparatus contains:

- means for decoding the audio signal;
- a frame-start estimator in which the approximate
  start of the next frame is determined using the
  following formula:
  L = N * R / fs / SL,
  where L is equal to the length of the frames, N is
  equal to the number of samples per frame, R is equal
  to the total data rate, fs is equal to the sampling
  frequency, SL is equal to the stipulated subunit,
  and in which L is rounded down to the next integer
  of subunits SL;
- a sync-word checker which, for the subsequent frame,
  first establishes whether the expected sync word for
  this frame appears, where, if the expected sync word
  for this frame appears, this subsequent frame is
  decoded in the decoding means without taking into
  account the length-variation information, and, if the
  expected sync word for this frame does not appear, the
  decoding of this subsequent frame is started in the
  decoding means one subunit SL later without taking into
  account the length-variation information.

Instead of evaluating the sync word, another known and
expected data pattern can also be evaluated.

Drawings 

Exemplary embodiments of the invention are described
with reference to the drawings, in which:
Figure 1 shows two successive data frames having the
same length;
Figure 2 shows two successive data frames having
different lengths;
Figure 3 shows a decoder in accordance with the
invention.

Exemplary embodiments

In data-reducing coding and decoding methods for audio
signals, such as in ISO/IEC 11172-3 (MPEG audio), the
coded audio signals are stored or transmitted in data
frames which respectively contain a fixed number N of
audio samples, e.g. 1152 samples. The data frames have,
in principle, a fixed length which is a multiple of a
basic unit, which is called a 'slot' in ISO/IEC 11172-3
and has a length of 8 bits in the 'layer 2' and 'layer
3' variants.

In Figure 1, each of the successive frames of the same
length of L bytes has a header Hd which contains a sync
word SY. The size of the subunit SL is 1 byte = 8 bits
in this example.

If audio signals having sampling frequencies fs of
32 000 Hz or 48 000 Hz are used, then the relationship
between the total data rate R (in bits/sec) and the
frame length L (in slots) is as follows:

L = N * R / fs / 8     (1)

Example:

N = 1152 samples; R = 128 000 bits/sec; fs = 48 000 Hz
gives L = 384 slots of 8 bits each.

If, however, a sampling frequency of 44 100 Hz is used,
then non-integer values for L are produced in (1). In
this way, the start of the next frame is determined
only approximately. Example:

N = 1152 samples; R = 128 000 bits/sec; fs = 44 100 Hz
gives L = 417.9591837 slots of 8 bits each.
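The two examples can be reproduced with exact arithmetic. The sketch below evaluates formula (1) for the three sampling frequencies mentioned in this description; the function name is hypothetical, and the values of N and R are taken from the examples above.

```python
from fractions import Fraction


def frame_length_slots(n_samples, rate, fs, slot_bits=8):
    """Formula (1): L = N * R / fs / 8, kept exact as a fraction of slots."""
    return Fraction(n_samples * rate, fs * slot_bits)


for fs in (32_000, 44_100, 48_000):
    length = frame_length_slots(1152, 128_000, fs)
    kind = "integer" if length.denominator == 1 else "non-integer"
    print(f"fs = {fs} Hz: L = {float(length):.7f} slots ({kind})")
```

Only the 44 100 Hz case yields a non-integer number of slots for these parameters, which is why only that case needs frames of varying length.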

However, since a frame can only have an integer number 
of slots, a frame length which varies by 1 slot (= 8 
bits) is used at a sampling frequency of 44.1 kHz in 
order to arrive, on average, at a particular fixed data 
rate (e.g. R = 128 000 bits/sec) and is signalled, as 
described above, using the padding bit in the header. 
When the result from formula (1) is rounded down, the 
correct frame start is often obtained for a sampling 
frequency of 44.1 kHz, namely for those frames which 
have not been lengthened by 1 slot. Often, however, an 
incorrect value is also obtained for the frame start. 
If the next frame starts to be decoded at this 
incorrect point, then an error is obtained, since the 
sync word to be expected at the start of the frame 
obviously does not appear. 
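How frequently the rounded-down result misses the true frame start can be estimated from the fractional part of L. The following sketch assumes that frames are only ever floor(L) or floor(L) + 1 slots long and that their long-run mean equals L; under that assumption the share of lengthened frames equals the fractional part of L. The function name is hypothetical.

```python
from fractions import Fraction


def lengthened_share(n_samples, rate, fs, slot_bits=8):
    """Share of frames that must be one slot longer than floor(L)."""
    mean_len = Fraction(n_samples * rate, fs * slot_bits)
    floor_len = mean_len.numerator // mean_len.denominator
    return mean_len - floor_len


share = lengthened_share(1152, 128_000, 44_100)
print(share)
```

For N = 1152, R = 128 000 bits/sec and fs = 44 100 Hz the result is 47/49. The exact share depends on the data rate and sampling frequency in use.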

Normally, decoders then switch to an error recovery
mode and start a fresh complex search for a sync word.
This typically produces a fault in the decoded output
signal.

In Figure 2, the first frame is one unit SL longer than
the second frame, i.e. it is L+1 bytes long. If decoding
starts at the place pointed at by the pointer LPOI for the
calculated and rounded-down variable L, then no sync
word is found at that place. For this reason, a check
is carried out one unit SL further on to determine
whether a sync word is present, and this sync word is
found at that place.

The invention therefore proposes, when decoding encoded
signals having the sampling frequency 44 100 Hz or
22 050 Hz:

- not storing or evaluating the padding bit;
- determining the approximate start of the next frame
  using the formula (1);
- rounding down the result from (1) to the next
  integer;
- for the subsequent frame, first establishing whether
  the expected sync word or another known data pattern
  appears;
- if this is the case, decoding this subsequent frame
  without taking into account the padding bit;
- if this is not the case, starting the decoding of
  this subsequent frame one slot later without taking
  into account the padding bit.

Figure 3 shows an inventive decoder which receives a
coded audio signal EAS which is supplied to a bit
stream deformatter BSD. BSD interchanges corresponding
data with a frame-start estimator FSE. The frame start
address estimated therein or a corresponding pointer
LPOI is used to establish, in a sync word checker SYCH,
whether there is a sync word at the appropriate point
in the data stream. If this is so, a decoder stage DEC
and/or the bit stream deformatter BSD receives the
information which prompts the further processing or
decoding of the next data frame to start at that point.
The audio signal decoded in the frequency domain is
supplied by the decoder stage DEC to a windowing stage
DW which multiplies portions of the audio signal, using
a synthesis filter for example, converts them to the
time domain and outputs a decoded audio signal DAS.
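The interplay of the frame-start estimator FSE and the sync word checker SYCH can be illustrated with a small sketch. It is not the decoder of Figure 3: the class and function names are hypothetical, the decoder stage DEC and windowing stage DW are omitted, a slot of 1 byte is assumed, and the 2-byte pattern and the zero-filled frames are toy data chosen for this example.

```python
class FrameStartEstimator:
    """FSE: estimates the pointer LPOI from formula (1), rounded down."""

    def __init__(self, n_samples, rate, fs, slot_bits=8):
        self.floor_len = (n_samples * rate) // (fs * slot_bits)

    def estimate(self, frame_start):
        return frame_start + self.floor_len


class SyncWordChecker:
    """SYCH: tests whether the expected pattern appears at a position."""

    def __init__(self, pattern):
        self.pattern = pattern

    def present(self, stream, pos):
        return stream[pos:pos + len(self.pattern)] == self.pattern


def frame_starts(stream, fse, sych):
    """Stand-in for BSD: yield the start offset of each frame in turn."""
    pos = 0
    while sych.present(stream, pos):
        yield pos
        lpoi = fse.estimate(pos)
        # Padding bit is never read: try LPOI, otherwise one slot later.
        pos = lpoi if sych.present(stream, lpoi) else lpoi + 1


PATTERN = b"\xff\xfb"            # toy 2-byte marker, an assumption
lengths = [418, 417, 418, 418]   # frame lengths in bytes for the toy stream
stream = b"".join(PATTERN + bytes(n - len(PATTERN)) for n in lengths)

fse = FrameStartEstimator(1152, 128_000, 44_100)
starts = list(frame_starts(stream, fse, SyncWordChecker(PATTERN)))
print(starts)
```

The loop ends when the pattern is found neither at LPOI nor one slot later, which in this toy stream happens only at the end of the data. A real decoder would presumably switch to its error handling at that point.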

The invention can also be used for related applications
in which a non-integer result from (1) causes a
variation in the frame length and said variation is
indicated using an information item similar to a
'padding bit'.

Patent Claims

1. Method for decoding (BSD, DEC, DW) a coded digital
audio signal (EAS) which is arranged in frames
containing headers (Hd), where the header in a frame
contains a respective information item regarding
whether this frame has a standard length (L) or a
length (L+1) which differs therefrom for some of the
frames, and where the frames contain a respective
sync word (SY), characterized by the following
steps:

- the length-variation information regarding the
  respective frame length is not stored or evaluated;
- the approximate start of the next frame is
  determined (FSE) using the following formula:
  L = N * R / fs / SL,
  where L is equal to the length of the frames, N is
  equal to the number of samples per frame, R is equal
  to the total data rate, fs is equal to the sampling
  frequency, SL is equal to the stipulated subunit for
  indicating the frame length;
- L is rounded down (FSE) to the next integer of
  subunits SL;
- for the subsequent frame, it is first established
  (SYCH) whether the expected sync word for this frame
  appears;
- if the expected sync word for this frame appears,
  this subsequent frame is decoded (DEC, DW) without
  taking into account the length-variation
  information;
- if the expected sync word for this frame does not
  appear, the decoding (DEC, DW) of this subsequent
  frame is started one subunit later without taking
  into account the length-variation information.

2. Method according to Claim 1, where the parameters
for calculating the formula for the approximate
frame start comprise known parameters in a
transmission system.

3. Method according to Claim 2, where at least one of
the parameters is transmitted in the header of
frames.

4. Method according to one of Claims 1 to 3, where,
instead of establishing whether an expected sync
word appears, it is established whether another
known pattern appears in the next frame.

5. Apparatus for decoding a coded digital audio signal
(EAS) which is arranged in frames containing headers
(Hd), where the header in a frame contains a
respective information item regarding whether this
frame has a standard length (L) or a length (L+1)
which differs therefrom for some of the frames, and
the frames contain a respective sync word (SY),
where, for ascertaining the frame length, the
length-variation information regarding the
respective frame length is not stored or evaluated,
and where the apparatus contains:

- means (BSD, DEC, DW) for decoding the audio signal
  (EAS);
- a frame-start estimator (FSE) in which the
  approximate start of the next frame is determined
  using the following formula:
  L = N * R / fs / SL,
  where L is equal to the length of the frames, N is
  equal to the number of samples per frame, R is equal
  to the total data rate, fs is equal to the sampling
  frequency, SL is equal to the stipulated subunit,
  and in which L is rounded down to the next integer
  of subunits SL;
- a sync-word checker (SYCH) which, for the subsequent
  frame, first establishes whether the expected sync
  word for this frame appears, where, if the expected
  sync word for this frame appears, this subsequent
  frame is decoded in the decoding means without
  taking into account the length-variation
  information, and, if the expected sync word for this
  frame does not appear, the decoding of this
  subsequent frame is started in the decoding means
  one subunit SL later without taking into account the
  length-variation information.